Authorship Attribution Based on Specific Vocabulary
Similar resources
Authorship Attribution
Authorship attribution, the science of inferring characteristics of the author from the characteristics of documents written by that author, is a problem with a long history and a wide range of application. Recent work in “non-traditional” authorship attribution demonstrates the practicality of automatically analyzing documents based on authorial style, but the state of the art is confusing. An...
Maximal Repeats Enhance Substring-based Authorship Attribution
This article tackles the Authorship Attribution task with a focus on language independence. We propose an alternative to variable-length character n-gram features in supervised methods: maximal repeats in strings. Whereas character n-grams are by nature redundant, maximal repeats are a condensed way to represent any substring of a corpus. Our experiments show that the redundant aspect of n...
Computer-Based Authorship Attribution Without Lexical Measures
The most important approaches to computer-assisted authorship attribution are exclusively based on lexical measures that either represent the vocabulary richness of the author or simply comprise frequencies of occurrence of common words. In this paper we present a fully-automated approach to the identification of the authorship of unrestricted text that excludes any lexical measure. Instead we ...
Unsupervised authorship attribution
We describe a technique for attributing parts of a written text to a set of unknown authors. Nothing is assumed to be known a priori about the writing styles of potential authors. We use multiple independent clusterings of an input text to identify parts that are similar and dissimilar to one another. We describe algorithms necessary to combine the multiple clusterings into a meaningful output....
Automatic Authorship Attribution
In this paper we present an approach to automatic authorship attribution dealing with real-world (or unrestricted) text. Our method is based on the computational analysis of the input text using a text-processing tool. Besides the style markers relevant to the output of this tool we also use analysis-dependent style markers, that is, measures that represent the way in which the text has been pr...
Journal
Journal title: ACM Transactions on Information Systems
Year: 2012
ISSN: 1046-8188,1558-2868
DOI: 10.1145/2180868.2180874